Search results for "data compression"

Showing 10 of 99 documents

Text Compression Using Antidictionaries

1999

We present a new text compression scheme based on forbidden words ("antidictionary"). We prove that our algorithms attain the entropy for balanced binary sources and run in linear time. Moreover, one of the main advantages of this approach is that it produces very fast decompressors. A second advantage is a synchronization property that is helpful when searching compressed data and allows parallel compression. Our algorithms can also be presented as "compilers" that create compressors dedicated to any previously fixed source. The techniques used in this paper come from information theory and finite automata.
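
The mechanism behind the scheme can be illustrated with a toy example: if the antidictionary records that a word ub never occurs in the source, then whenever the text read so far ends with u, the next bit cannot be b and can be omitted. The Python sketch below (illustrative names, not the authors' automaton-based algorithm) drops exactly those forced bits.

```python
def dca_compress(text, antidictionary):
    """Toy antidictionary compressor: drop every bit whose value is forced
    because the alternative would complete a forbidden word.
    `text` is a string over {'0', '1'}; `antidictionary` is a set of
    forbidden binary words such as {"11", "000"}."""
    kept = []
    for i, bit in enumerate(text):
        forced = None
        for w in antidictionary:
            u, b = w[:-1], w[-1]
            if text[:i].endswith(u):          # reading b next would create w
                forced = "1" if b == "0" else "0"
                break
        if forced is None:
            kept.append(bit)                  # unpredictable bit: keep it
        # otherwise bit == forced and is omitted; a decompressor with the
        # same antidictionary (and the original length) can restore it
    return "".join(kept), len(text)

# Example: with "11" forbidden, every bit following a '1' is predictable.
# dca_compress("10100100", {"11"}) -> ("11010", 8)
```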

Keywords: theoretical computer science; finite-state machines; data structures and algorithms; forbidden words; information theory; finite automata; parallel compression; pattern matching; entropy; time complexity; data compression

Suffix Array Construction on Multi-GPU Systems

2019

Suffix arrays are prevalent data structures that are fundamental to a wide range of applications, including bioinformatics, data compression, and information retrieval. Therefore, various algorithms for (parallel) suffix array construction on both CPUs and GPUs have been proposed over the years. Although they provide significant speedup over their CPU-based counterparts, existing GPU implementations share a common disadvantage: input text sizes are limited by the scarce memory of a single GPU. In this paper, we overcome these memory limitations by exploiting multi-GPU nodes featuring fast NVLink interconnects. In order to achieve high performance for this communication-intensive task, we …
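
For orientation, the classic sequential prefix-doubling construction fits in a few lines of Python; it is only a single-node sketch of the computation that the paper distributes across multiple GPUs over NVLink, not the authors' algorithm.

```python
def suffix_array(s):
    """O(n log^2 n) prefix-doubling suffix array construction (sequential sketch).
    Ranks of length-k prefixes are combined pairwise to rank length-2k prefixes."""
    n = len(s)
    sa = list(range(n))
    rank = [ord(c) for c in s]
    k = 1
    while n > 1:
        key = lambda i: (rank[i], rank[i + k] if i + k < n else -1)
        sa.sort(key=key)
        new_rank = [0] * n
        for j in range(1, n):
            new_rank[sa[j]] = new_rank[sa[j - 1]] + (key(sa[j]) > key(sa[j - 1]))
        rank = new_rank
        if rank[sa[-1]] == n - 1:     # all ranks distinct: done
            break
        k *= 2
    return sa

# suffix_array("banana") -> [5, 3, 1, 0, 4, 2]
```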

Keywords: suffix arrays; multi-core processors; speedup; parallel computing; CUDA; shared memory; data structures; data compression
In: Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing

The Myriad Virtues of Wavelet Trees

2009

Wavelet Trees were introduced in [Grossi, Gupta and Vitter, SODA '03] and have rapidly been recognized as a very flexible tool for the design of compressed full-text indexes and data compressors. Although several papers have investigated the beauty and usefulness of this data structure in the full-text indexing scenario, its impact on data compression has not been fully explored. In this paper we provide a complete theoretical analysis of a wide class of compression algorithms based on Wavelet Trees. We also show how to improve their asymptotic performance by introducing a novel framework, called Generalized Wavelet Trees, that aims for the best combination of binary compressors (like,…
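
As a point of reference, a minimal uncompressed, pointer-based wavelet tree with rank queries can be sketched in Python as follows; names and layout are illustrative, and this is not the Generalized Wavelet Tree framework analysed in the paper.

```python
class WaveletTree:
    """Didactic wavelet tree over integer symbols in [lo, hi], supporting
    rank(c, i) = number of occurrences of c among the first i symbols."""
    def __init__(self, seq, lo=None, hi=None):
        if lo is None:
            lo, hi = min(seq), max(seq)
        self.lo, self.hi = lo, hi
        if lo == hi or not seq:
            return                                # leaf or empty node
        mid = (lo + hi) // 2
        bits = [c > mid for c in seq]             # 0 -> left half, 1 -> right half
        self.ones = [0]                           # prefix counts of 1-bits
        for b in bits:
            self.ones.append(self.ones[-1] + b)
        self.left = WaveletTree([c for c in seq if c <= mid], lo, mid)
        self.right = WaveletTree([c for c in seq if c > mid], mid + 1, hi)

    def rank(self, c, i):
        if i == 0 or self.lo == self.hi:
            return i
        mid = (self.lo + self.hi) // 2
        ones = self.ones[i]
        if c <= mid:
            return self.left.rank(c, i - ones)    # descend with the 0-bit count
        return self.right.rank(c, ones)           # descend with the 1-bit count

# Example:
# wt = WaveletTree([ord(c) for c in "abracadabra"])
# wt.rank(ord('a'), 11) == 5
```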

Keywords: wavelet trees; binary trees; weight-balanced trees; wavelet transform; cascade algorithm; Huffman coding; set partitioning in hierarchical trees; empirical entropy; Burrows-Wheeler Transform; data compression

On parsing optimality for dictionary-based text compression—the Zip case

2013

Dictionary-based compression schemes have been the most commonly used data compression schemes since they appeared in the foundational 1977 paper of Ziv and Lempel, and they are generally referred to as LZ77. Their work is the basis of Zip, gzip, 7-Zip and many other compression software utilities. Some of these compression schemes use variants of the greedy approach to parse the text into dictionary phrases; others have abandoned the greedy approach to improve the compression ratio. Recently, two bit-optimal parsing algorithms have been presented, filling the gap between theory and best practice. We present a survey on the parsing problem for dictionary-based text compression, identifying noticeable results …
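
For reference, the greedy strategy the survey starts from is easy to sketch: at each position take the longest match the window offers and restart right after it. The naive Python sketch below (illustrative names and thresholds) shows only this greedy strategy, not the bit-optimal parsers or the Deflate code used by Zip and gzip.

```python
def lz77_greedy_parse(text, window=1 << 15):
    """Naive greedy LZ77 parser: at each position emit the longest match found
    in the sliding window as an (offset, length) pair, or a literal when no
    worthwhile match exists. Quadratic-time didactic sketch, not Deflate."""
    i, phrases = 0, []
    while i < len(text):
        best_len, best_off = 0, 0
        for j in range(max(0, i - window), i):       # candidate match starts
            length = 0
            while i + length < len(text) and text[j + length] == text[i + length]:
                length += 1                           # overlapping copies allowed
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= 3:                             # short matches cost more than literals
            phrases.append(("match", best_off, best_len))
            i += best_len
        else:
            phrases.append(("literal", text[i]))
            i += 1
    return phrases

# lz77_greedy_parse("abababab") ->
#   [('literal', 'a'), ('literal', 'b'), ('match', 2, 6)]
```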

Keywords: dictionary-based text compression; parsing optimality; parsing; LZ77 algorithm; Deflate algorithm; lossless compression; compression ratio; data compression
In: Journal of Discrete Algorithms

Image Processing Chain for Digital Still Cameras Based on the SIMPil Architecture

2005

The new generation of wireless devices heralds products for integrated portable image and video communication, which demand high computing performance for image and video applications. Portable MultiMedia Supercomputers (PMMS), a new class of architectures, combine the high computational performance needed by multimedia applications with the high energy efficiency needed by portable devices. Among PMMS, the SIMPil (SIMD Pixel processor) architecture satisfies the above requirements, especially for video and digital image processing tasks. In this paper, we exploit the SIMPil computation and throughput efficiency to implement the whole image processing chain of a digital s…

Keywords: Bayer filter; pixels; digital photography; image processing; still cameras; digital images; image sensors; digital image processing; digital signal processing; computer hardware; data compression

The Rightmost Equal-Cost Position Problem

2013

LZ77-based compression schemes compress the input text by replacing factors of the text with an encoded reference to a previous occurrence, formed by the pair (length, offset). For a given factor, the smaller the offset, the smaller the resulting compression ratio. This is optimally achieved by using the rightmost occurrence of a factor in the previous text. Given a cost function, for instance the minimum number of bits used to represent an integer, we define the Rightmost Equal-Cost Position (REP) problem as the problem of finding one of the occurrences of a factor whose cost is equal to the cost of the rightmost one. We present the Multi-Layer Suffix Tree data structure that, for…
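
The problem statement can be mirrored with a naive scan, assuming a toy cost function that counts the bits of an offset; the names below are illustrative, and the paper's Multi-Layer Suffix Tree answers the same query without scanning.

```python
def bit_cost(offset):
    """Toy cost function: number of bits needed to write the offset."""
    return offset.bit_length()

def rightmost_equal_cost_position(text, i, factor):
    """Return an occurrence of `factor` lying before position i whose offset
    cost equals the cost of the rightmost occurrence (naive O(n) scan)."""
    occ = [j for j in range(i)
           if text.startswith(factor, j) and j + len(factor) <= i]
    if not occ:
        return None
    target = bit_cost(i - max(occ))           # cost of the rightmost occurrence
    for j in occ:                             # any equal-cost occurrence will do
        if bit_cost(i - j) == target:
            return j

# rightmost_equal_cost_position("abcabcXabc", 7, "abc") returns 0:
# offsets 7 and 4 both need 3 bits, so the occurrence at position 0 already
# matches the cost of the rightmost occurrence at position 3.
```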

Keywords: suffix trees; text compression; dictionary-based text compression; LZ77 compression; lossless compression; full-text indexing; data structures; constant time; compression ratio; data compression; Data Structures and Algorithms (cs.DS); Information Theory (cs.IT)

Image Compression by 2D Motif Basis

2011

Approaches to image compression and indexing based on extensions to 2D of some of the Lempel-Ziv incremental parsing techniques have been proposed in the recent past. In these approaches, an image is decomposed into a number of patches, each consisting of a solid square or rectangular block. This paper proposes image compression techniques based on patches that are not necessarily solid blocks but instead contain a controlled number of undetermined or don't-care pixels. Such patches are chosen from a set of candidate motifs that are extracted in turn from the image's 2D motif basis, the latter consisting of a compact set of patterns that result from the autocorrelation of the image w…
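
The basic primitive is matching a patch that contains don't-care positions against a region of the image. A minimal Python sketch (hypothetical names, nested lists as images, None marking an undetermined pixel):

```python
def motif_matches(image, motif, top, left):
    """True if `motif` (a 2D list whose None entries are don't-care pixels)
    matches `image` at the given top-left corner. Illustrative sketch only:
    the paper extracts candidate motifs from the image's 2D motif basis."""
    rows, cols = len(motif), len(motif[0])
    if top + rows > len(image) or left + cols > len(image[0]):
        return False
    for r in range(rows):
        for c in range(cols):
            v = motif[r][c]
            if v is not None and v != image[top + r][left + c]:
                return False                  # a determined pixel disagrees
    return True

# A 2x2 motif whose lower-right pixel is undetermined:
# motif_matches([[1, 2], [3, 9]], [[1, 2], [3, None]], 0, 0) -> True
```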

Keywords: image compression; pixels; pattern recognition; JPEG; motif extraction; pattern discovery; color cell compression; data compression
In: 2011 Data Compression Conference

Imaging of Temporomandibular Joint: Approach by Direct Volume Rendering

2014

Background: The purpose of this study was to conduct a morphological analysis of the temporomandibular joint, a highly specialized synovial joint that permits movement and function of the mandible. Materials and Methods: We studied the temporomandibular joint anatomy directly in living subjects, from 3D images obtained by Computed Tomography and Nuclear Magnetic Resonance acquisition and by the subsequent re-engineering techniques of 3D surface rendering and volume rendering. Data were analysed with the goal of isolating, identifying and distinguishing the anatomical structures of the joint, and of extracting as much information as possible using software for post-proces…

Keywords: temporomandibular joint (TMJ); synovial joint; mandible; stomatognathic system; musculoskeletal diseases; magnetic resonance imaging; medical imaging; volume rendering; three-dimensional imaging; computer-generated imaging; biomedical engineering

Analog joint source-channel Multiple Description coding scheme over AWGN parallel channels

2011

We propose a low-complexity analog joint source-channel coding Multiple Description (MD) scheme for transmitting the symbols of a Gaussian source across a pair of independent AWGN channels. Each channel output is observed by a separate receiver, while a third receiver has both outputs available. At the transmitter side, a pair of bandwidth-reduction analog mappings are used for joint source-channel coding. The presented scheme has the inherent advantage over digital MD schemes based on separation that coding and decoding can be performed on a single letter (or symbol), a strategy that is very suitable for applications where the latency originated by digital compression and th…
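
As a frame of reference only, the three-receiver setup can be simulated with the simplest possible analog scheme: each uncoded source symbol is repeated on both channels and every receiver applies linear MMSE estimation. This toy baseline (all names and parameters are illustrative) is not the bandwidth-reduction mappings proposed in the paper.

```python
import numpy as np

# Toy analog multiple-description baseline: one Gaussian symbol per channel use,
# two independent AWGN channels, two side receivers and one central receiver.
rng = np.random.default_rng(0)
n, snr_db = 100_000, 10.0
noise_var = 10 ** (-snr_db / 10)              # unit-power source and channel input

x = rng.standard_normal(n)                    # Gaussian source symbols
y1 = x + np.sqrt(noise_var) * rng.standard_normal(n)
y2 = x + np.sqrt(noise_var) * rng.standard_normal(n)

# Linear MMSE estimates: side decoders use one output, the central decoder both.
x_side1 = y1 / (1 + noise_var)
x_side2 = y2 / (1 + noise_var)
x_central = ((y1 + y2) / 2) / (1 + noise_var / 2)

for name, est in [("side 1", x_side1), ("side 2", x_side2), ("central", x_central)]:
    print(name, "MSE:", np.mean((x - est) ** 2))
```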

Keywords: multiple description coding; channel codes; variable-length codes; Shannon–Fano coding; additive white Gaussian noise; decoding methods; communication channels; data compression
In: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

A New Combinatorial Approach to Sequence Comparison

2008

In this paper we introduce a new alignment-free method for comparing sequences which is combinatorial in nature and uses neither a compressor nor any information-theoretic notion. The method is based on an extension of the Burrows-Wheeler Transform, a transformation widely used in the context of data compression. The new extended transformation takes as input a multiset of sequences and produces as output a string obtained by a suitable rearrangement of the characters of all the input sequences. By using such a transformation we give a general method for comparing sequences that takes into account how much the characters coming from the different input sequences are mixed in the output…
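
The flavour of the method can be prototyped with a simplified stand-in for the extended transform: jointly sort the rotations of all terminator-padded input sequences, record which sequence each output character comes from, and measure how interleaved those origins are. The sketch below uses illustrative names and a same-terminator shortcut, not the authors' construction or distance measure.

```python
def ebwt_like(seqs):
    """Simplified stand-in for the extended BWT: jointly sort all rotations of
    the terminated input sequences and record, for every output character,
    which sequence it came from."""
    rotations = []
    for k, s in enumerate(seqs):
        t = s + "$"                           # same terminator for every sequence
        for i in range(len(t)):
            rotations.append((t[i:] + t[:i], k))
    rotations.sort(key=lambda rot_origin: rot_origin[0])
    transform = "".join(rot[-1] for rot, _ in rotations)
    colors = [origin for _, origin in rotations]
    return transform, colors

def mixing_score(colors):
    """Fraction of adjacent output positions whose characters come from
    different input sequences; higher values indicate more interleaving."""
    changes = sum(a != b for a, b in zip(colors, colors[1:]))
    return changes / max(1, len(colors) - 1)

# t, origins = ebwt_like(["ACGT", "ACGA"])
# mixing_score(origins)  # similar sequences interleave more in the sorted order
```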

Keywords: Burrows–Wheeler Transform; sequence comparison; multisets; strings; genomes; phylogenetics; categorization; information theory; data compression